Beyond Metadata: Enriching life science publications in Livivo with semantic entities from the linked data cloud
Authors
Abstract
Queries in literature search engines are usually run against metadata derived from scientific publications. The search engine LIVIVO holds a corpus of 63 million life science publications. About 25 million of these are taken from PubMed and are annotated with Medical Subject Headings (MeSH). The remaining publications have heterogeneous keyword annotations. Hence, a workflow is developed using the Unstructured Information Management Architecture (UIMA) to enrich publications from LIVIVO with semantic annotations. The UIMA analysis engine ConceptMapper performs entity recognition based on dictionaries built from MeSH, the pharmaceutical database DrugBank, and the multilingual agricultural vocabulary AGROVOC. Additionally, ontological relationships among the semantic entities are preserved in the graph database Neo4j. The ontological information is derived from the MeSH tree, the Anatomical Therapeutic Chemical classification system (ATC) for pharmaceuticals, and the AGROVOC tree. The ontological structure of the semantic entities enables functionalities such as query expansion, aggregation of search results, and concept-based ranking algorithms. Demo: http://labs.livivo.de JSON-LD: https://datahub.io/dataset/livtdm
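To make the two ideas in the abstract concrete, here is a minimal sketch of dictionary-based entity recognition followed by hierarchy-based query expansion. It is not the UIMA ConceptMapper or a Neo4j query; it is a plain-Python stand-in under stated assumptions. The dictionary entries, the `toy:` concept identifiers, the `NARROWER` hierarchy, and the function names `annotate` and `expand` are all hypothetical and chosen only for illustration.

```python
import re

# Toy dictionary: surface form -> concept id.
# Assumption: these ids are made up and are NOT real MeSH/DrugBank/AGROVOC identifiers.
DICTIONARY = {
    "neoplasms": "toy:neoplasms",
    "breast neoplasms": "toy:breast-neoplasms",
    "aspirin": "toy:aspirin",
}

# Toy hierarchy: parent concept -> narrower (child) concepts.
# Assumption: stands in for a tree such as the MeSH tree stored in a graph database.
NARROWER = {
    "toy:neoplasms": ["toy:breast-neoplasms"],
}

# Longest dictionary entry, measured in tokens, bounds the lookup window.
MAX_TOKENS = max(len(term.split()) for term in DICTIONARY)


def annotate(text):
    """Return (surface form, concept id) pairs using greedy longest-match lookup."""
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        matched_length = 0
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_TOKENS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in DICTIONARY:
                found.append((candidate, DICTIONARY[candidate]))
                matched_length = n
                break
        # Skip past the matched phrase, or advance one token if nothing matched.
        i += matched_length if matched_length else 1
    return found


def expand(concept_id):
    """Return the concept plus all of its descendants in the toy hierarchy."""
    result = [concept_id]
    seen = {concept_id}
    stack = [concept_id]
    while stack:
        current = stack.pop()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                result.append(child)
                stack.append(child)
    return result


if __name__ == "__main__":
    print(annotate("Aspirin use and breast neoplasms risk"))
    print(expand("toy:neoplasms"))
```

With this sketch, a search for the broader concept can also retrieve documents annotated only with a narrower one, which is the intuition behind query expansion over an ontological structure.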
Similar resources
Linkitup: Semantic Publishing of Research Data
Linkitup is a Web-based dashboard for enrichment of research output published via data repository services. It takes metadata entered through Figshare.com and tries to find equivalent terms, categories, persons or entities on the Linked Data cloud and several Web 2.0 services. It extracts references from publications, and tries to find the corresponding Digital Object Identifier (DOI). Linkitup...
Bridging the Gap between Linked Data and the Semantic Desktop
The exponential growth of the World Wide Web in the last decade brought an explosion in the information space, which has important consequences also in the area of scientific research. Finding relevant work in a particular field and exploring the links between publications is currently a cumbersome task. Similarly, on the desktop, managing the publications acquired over time can represent a rea...
Linking Entities in Scientific Metadata
Linked entity data in metadata records builds a foundation for the Semantic Web. Even though metadata records contain rich entity data, there is no linking between associated entities such as persons, datasets, projects, publications, or organizations. We conducted a small experiment using the dataset collection from the Hubbard Brook Ecosystem Study (HBES), in which we converted the entities a...
Investigating the Reaction of Web Search Engines to Metadata Records Based on the Combined Method of Microdata and Linked Data
The purpose of this research was to find out how Web search engines react to metadata records created with the combined method of Rich Snippets and Linked Data. 200 metadata records in two groups (100 records as the control group with the normal structure, and 100 records created based on microdata and implemented in RDF/XML as the experimental group) extracted from the information gatewa...
spatial@linkedscience - Exploring the Research Field of GIScience with Linked Data
Metadata for scientific publications contain various explicit and implicit spatio-temporal references. Data on conference locations as well as author and editor affiliations – both changing over time – enable insights into the geographic distribution of scientific fields and particular specializations. At the same time, these byproducts of scientific bibliographies offer a great opportunity to ...